SYNTACTIC-PROSODIC LABELING OF LARGE SPONTANEOUS SPEECH DATA-BASES

Authors

  • A. Batliner
  • R. Kompe
  • A. Kießling
  • H. Niemann
Abstract

In automatic speech understanding, dividing continuously running speech into syntactic chunks is a major problem. Syntactic boundaries are often marked by prosodic means. Training statistical models of prosodic boundaries requires large databases. For the German Verbmobil project (automatic speech-to-speech translation), we developed a syntactic-prosodic labeling scheme in which two main types of boundaries (major syntactic boundaries and syntactically ambiguous boundaries) and some other special boundaries are labeled for a large Verbmobil spontaneous speech corpus. We compare the results of classifiers (multi-layer perceptrons and language models) trained on these syntactic-prosodic boundary labels with classifiers trained on perceptual-prosodic and pure syntactic labels. The main advantage of the rough syntactic-prosodic labels presented in this paper is that large amounts of data could be labeled within a short time. Therefore, the classifiers trained with these labels turned out to be superior (recognition rates of up to 96%).

Similar resources

Syntactic-prosodic Labeling of Large Spontaneous Speech Data-bases

The research project underlying this report was funded by the German Federal Minister for Education, Science, Research and Technology under grant numbers 01 IV 102 F/4 and 01 IV 102 H/0. Responsibility for the content of this work lies with the authors. ABSTRACT In automatic speech understanding, the division of continuously running speech into syntactic chu...

Syntactic-prosodic labeling of large spontaneous speech data-bases

In automatic speech understanding, the division of continuously running speech into syntactic chunks is a great problem. Syntactic boundaries are often marked by prosodic means. For the training of statistical models for prosodic boundaries, large data-bases are necessary. For the German Verbmobil project (automatic speech-to-speech translation), we developed a syntactic-prosodic labeling scheme wh...

Unsupervised prosody labeling for constructing Mandarin TTS

This paper introduces an unsupervised prosody labeling method for preparing a large speech corpus used in developing a Mandarin Text-to-Speech system. Adopting a four-layer prosody hierarchy, the proposed method performs an unsupervised segmental clustering that iteratively segments spoken utterances into strings of prosodic constituents and models the patterns of the segmented prosodic constit...

Japanese prosodic labeling support system utilizing linguistic information

A prosodic labeling support system has been developed. Large-scale prosodic databases have been strongly desired for years; however, the construction of such databases depends on hand labeling because of the variety of prosody. We aim not to automate the whole labeling process, but to make hand labeling more efficient by providing the labelers with appropriate support information. The method...

Automatic prosodic break labeling for Mandarin Chinese speech data

For corpus-based speech synthesis, large quantities of labeled speech are required. Manually labeling speech data is quite labor-intensive, so automatic speech labeling is highly desirable. Prosodic break detection is one of the tasks in automatic speech labeling. In this paper, we propose an automatic break detection algorithm for Mandarin Chinese speech. In this approach, we use energy ...

Publication date: 1996